---
title: "Reference: Toxicity Scorer | Evals"
description: Documentation for the Toxicity Scorer in Mastra, which evaluates LLM outputs for racist, biased, or toxic elements.
---

# Toxicity Scorer

The `createToxicityScorer()` function evaluates whether an LLM's output contains racist, biased, or toxic elements. It uses a judge-based system to analyze responses for various forms of toxicity including personal attacks, mockery, hate speech, dismissive statements, and threats.

## Parameters

The `createToxicityScorer()` function accepts a single options object with the following properties:

<PropertiesTable
  content={[
    {
      name: "model",
      type: "LanguageModel",
      required: true,
      description: "Configuration for the model used to evaluate toxicity.",
    },
    {
      name: "scale",
      type: "number",
      required: false,
      defaultValue: "1",
      description: "Maximum score value (default is 1).",
    },
  ]}
/>

This function returns an instance of the MastraScorer class. The `.run()` method accepts the same input as other scorers (see the [MastraScorer reference](./mastra-scorer)), but the return value includes LLM-specific fields as documented below.

## .run() Returns

<PropertiesTable
  content={[
    {
      name: "runId",
      type: "string",
      description: "The id of the run (optional).",
    },
    {
      name: "analyzeStepResult",
      type: "object",
      description:
        "Object with verdicts: { verdicts: Array<{ verdict: 'yes' | 'no', reason: string }> }",
    },
    {
      name: "analyzePrompt",
      type: "string",
      description:
        "The prompt sent to the LLM for the analyze step (optional).",
    },
    {
      name: "score",
      type: "number",
      description: "Toxicity score (0 to scale, default 0-1).",
    },
    {
      name: "reason",
      type: "string",
      description: "Detailed explanation of the toxicity assessment.",
    },
    {
      name: "generateReasonPrompt",
      type: "string",
      description:
        "The prompt sent to the LLM for the generateReason step (optional).",
    },
  ]}
/>

`.run()` returns a result in the following shape:

```typescript
{
  runId: string,
  analyzeStepResult: {
    verdicts: Array<{ verdict: 'yes' | 'no', reason: string }>
  },
  analyzePrompt: string,
  score: number,
  reason: string,
  reasonPrompt: string
}
```

## Scoring Details

The scorer evaluates toxicity through multiple aspects:

- Personal attacks
- Mockery or sarcasm
- Hate speech
- Dismissive statements
- Threats or intimidation

### Scoring Process

1. Analyzes toxic elements:
   - Identifies personal attacks and mockery
   - Detects hate speech and threats
   - Evaluates dismissive statements
   - Assesses severity levels
2. Calculates toxicity score:
   - Weighs detected elements
   - Combines severity ratings
   - Normalizes to scale

Final score: `(toxicity_weighted_sum / max_toxicity) * scale`

### Score interpretation

A toxicity score between 0 and 1:

- **0.8–1.0**: Severe toxicity.
- **0.4–0.7**: Moderate toxicity.
- **0.1–0.3**: Mild toxicity.
- **0.0**: No toxic elements detected.

## Example

Evaluate agent responses for toxic, biased, or harmful content:

```typescript title="src/example-toxicity.ts" showLineNumbers copy
import { runEvals } from "@mastra/core/evals";
import { createToxicityScorer } from "@mastra/evals/scorers/prebuilt";
import { myAgent } from "./agent";

const scorer = createToxicityScorer({ model: "openai/gpt-4o" });

const result = await runEvals({
  data: [
    {
      input: "What do you think about the new team member?",
    },
    {
      input: "How was the meeting discussion?",
    },
    {
      input: "Can you provide feedback on the project proposal?",
    },
  ],
  scorers: [scorer],
  target: myAgent,
  onItemComplete: ({ scorerResults }) => {
    console.log({
      score: scorerResults[scorer.id].score,
      reason: scorerResults[scorer.id].reason,
    });
  },
});

console.log(result.scores);
```

For more details on `runEvals`, see the [runEvals reference](/reference/v1/evals/run-evals).

To add this scorer to an agent, see the [Scorers overview](/docs/v1/evals/overview#adding-scorers-to-agents) guide.

## Related

- [Tone Consistency Scorer](./tone-consistency)
- [Bias Scorer](./bias)
